Optimizing Cepstral Features for Audio Classification
نویسندگان
چکیده
Cepstral features have been widely used in audio applications. Domain knowledge has played an important role in designing different types of cepstral features proposed in the literature. In this paper, we present a novel approach for learning optimized cepstral features directly from audio data to better discriminate between different categories of signals in classification tasks. We employ multi-layer feedforward neural networks to model the cepstral feature extraction process. The network weights are initialized to replicate a reference cepstral feature like the mel frequency cepstral coefficient. We then propose a embedded approach that integrates feature learning with the training of a support vector machine (SVM) classifier. A single optimization problem is formulated where the feature and classifier variables are optimized simultaneously so as to refine the initial features and minimize the classification risk. Experimental results have demonstrated the effectiveness of the proposed feature learning approach, outperforming competing methods by a large margin on benchmark data.
منابع مشابه
Determining the effective features in classification of heart sounds using trained intelligent network and genetic algorithm
Heart diseases are among the most important causes of mortality in the world, especially in industrial countries. Using heart sounds and the features extracted from them are among the non-aggressive diagnosis and prognosis methods for heart diseases. In this study, the time-scale, Cepstral, frequency, temporal and turbulence features are saved and extracted from the heart sounds, and then they ...
متن کاملAcoustic Emotion Recognition Using Linear and Nonlinear Cepstral Coefficients
Recognizing human emotions through vocal channel has gained increased attention recently. In this paper, we study how used features, and classifiers impact recognition accuracy of emotions present in speech. Four emotional states are considered for classification of emotions from speech in this work. For this aim, features are extracted from audio characteristics of emotional speech using Linea...
متن کاملCombining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
متن کاملExperimental evaluation of features for robust speaker identification
This paper presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used to allow a controlled comparison. A general, maximum-likelihood classifier based on Gaussian mixture densities is used as the classifie...
متن کاملRecognition and Classification of Human Emotion from Audio
In this paper, the audio emotion recognition system is proposed that uses a mixture of rule-based and machine learning techniques to improve the recognition efficacy in the audio paths. The audio path is designed using a combination of input prosodic features (pitch, log-energy, zero crossing rates and Teager energy operator) and spectral features (Mel-scale frequency cepstral coefficients). Me...
متن کامل